A Blueprint for
Better Information: 


BY AMANDA JANICE ROBERSON, JAMEY RORISON, AND MAMIE VOIGHT


October 2017 


Acknowledgments 

This report is the result of hard work and thoughtful contribu- 
tions from many individuals and organizations. We would like 
to thank the Institute for Higher Education Policy staff who 
helped in this effort, including Michelle Asha Cooper, presi- 
dent; Kelly Leon, communications and advocacy officer; Lacey 
Leegwater, senior advisor; Alain Poutré, research analyst; 
Katherine Wheatle, research analyst; and India Heckstall, Tyler 
Wu, and Jack Porter, former policy interns. 


We are appreciative of Sanametrix, specifically Joe Collins, 
Zac Mangold, Christina Tellez, Jerry Malitz, and Brett Richards, 
who created the technical framework for this project. We also 
thank a number of experts and organizations who participated 
in the advisory committee, including: 


Matthew Soldner, American Institutes for Research, 
Committee Chair 

Kazim Ali, Formative Co. 

Sherry Bennet, University of Maryland University College 

Courtney Brown, Lumina Foundation 

Cole Clark, Deloitte 

Afet Dundar, National Student Clearinghouse 

Jennifer Engle, Bill & Melinda Gates Foundation 

Tamar Epstein, Ellucian 

Neal Gibson, formerly of the Arkansas Research Center 


Will Goldschmidt, DBDriven.Net LLC 

Joanna Lyn Grama, EDUCAUSE 

Terrell Halaska, HCM Strategists 

Ron S. Jarmin, U.S. Census Bureau

Christine Keller, Association for Institutional Research 

Amy Laitinen, New America 

Patrick Lane, Western Interstate Commission for Higher 
Education 

Nicole Melander, Civitas Learning 

Ben Miller, Center for American Progress 

Doug Newhard, Hartman Executive Advisors 

Monal Patel, Purdue University 

Neil Ridley, Georgetown University Center on Education and 
the Workforce 

Jeff Sellers, AEM Corporation 

Douglas Shapiro, National Student Clearinghouse 

Kathy Stack, Laura and John Arnold Foundation 

Rachel Zinn, Workforce Data Quality Campaign 


Finally, we are grateful for the Bill & Melinda Gates Founda- 
tion’s support of this and other research on postsecondary 
data. Although many have contributed their thoughts and 
feedback throughout the production of this report, the 
research and recommendations presented here are those of 
the authors alone. 


Executive Summary 

Research shows that investing in a college education pays divi-
dends for both students and society.1 But persistent gaps in
college access, success, and attainment prove that not 
everyone shares the same benefits. Outcomes vary within and 
across institutions, yet we often don’t know which programs at 
which institutions provide a return on investment, and for which 
students. The data that students, institutions, and policymakers 
currently use for decision-making are disconnected, duplica- 
tive, and incomplete. To better serve students, the higher educa- 
tion system needs a data infrastructure that reduces complexity 
while measuring the outcomes of all students—especially those 
who have been traditionally underserved. 


This brief builds on the 2016 paper series “Envisioning the
National Postsecondary Infrastructure in the 21st Century” by
exploring a federal student-level data network (SLDN) through 
both technical and policy lenses. We know policymakers are 
interested in creating a workable SLDN, as evidenced by the 
introduction of the bipartisan College Transparency Act of 
2017 (CTA) and previous iterations of the Student Right to 
Know Before You Go Act.* This paper, therefore, reinforces the 
case for building an SLDN while describing the technical, 
operational, and governance requirements needed to success- 
fully and securely design and implement such a system.® 
Figure ES1, below, shows our key policy recommendations for 
an SLDN in three primary categories: Operations & Capacity, 
Data Governance, and Privacy & Security. 


The creation of a federal SLDN would transform the higher 
education system. If higher education stakeholders had access 
to connected and complete data, and used it thoughtfully, more 
students would have a true opportunity to succeed in college. 
Policymakers could make more informed decisions about 
federal and state investments in higher education. Institutions
could benchmark performance and create programs that
benefit students. Students and families could choose a best-fit 
college. And researchers could provide a data-driven founda- 
tion for these stakeholders. High-quality data is an essential, yet 
often overlooked, component of most policy areas. 


However, the federal government did not design its systems to 
collect data to answer all of the questions pertinent to today’s 
student population. For example, the most commonly used 
graduation rate only measures the percentage of first-time, 
full-time students who complete their degree or credential at 
their first institution within six years, leaving out part-time and 
transfer students. As a result, this rate reflects only about 47 
percent of today’s students entering college. The federal 
government collects data at different levels (e.g., institution- 
level, student-level, loan-level), and its data systems cannot 
currently connect to each other to provide a clear picture of
the complex and varied pathways students take to, and
through, higher education. Given rising college costs, increased
interest in return on investment for families and the government,
and a lack of clear information about 21st century students’ 
outcomes, we must take action to improve our national data 
infrastructure and put useful information into the hands of 
decision makers. 


Students, policymakers, and institutions need a limited, core 
set of consistent, institution- and program-level metrics that
serve specific policy, consumer information, and institutional 
improvement purposes. This brief provides the necessary 
guidance to design and implement a secure data system that 
protects student privacy, while equipping key stakeholders 
with the information they need to make equitable, student 
success-focused decisions. 


Figure ES1: Key Policy Recommendations for a Federal Student-Level Data Network (SLDN)


Operations and Capacity

> Authorize creation of a federal SLDN.
> Leverage existing federal and institutional data to count all students and all outcomes.
> Replace components of existing data collections with a federal SLDN.
> Shift staffing and data system resources from the Integrated Postsecondary Education Data System (IPEDS) to the SLDN.


Data Governance

> Include key stakeholders on the data governance team to inform data integrity, management, and privacy.
> Coordinate on the development and use of unique identifiers to align data across collections.
> Adapt best practices from existing data sharing efforts.


Privacy and Security

> Follow data minimization principles to limit data included in the system to only the necessary elements, retain data only as long as needed, and restrict its use to educational purposes.
> Host the system in a statistical agency and require adherence to strict privacy and security laws and standards, including conducting routine audits, using encryption technology, and following relevant standards and practices from the National Institute of Standards and Technology, the Fair Information Practice Principles, and other leading protocols.
> Implement clear role-based access protocols.


Introduction

Research shows that investing in a college education is worth 
it.7 Persistent equity gaps in college access, success, and
attainment, however, prove that these investments don’t pay 
off equally for all students at all institutions. We need to know 
which programs at which institutions provide a return on 
investment, and for which students. Outcomes vary across 
institutions, yet the data that students, institutions, and policy- 
makers currently use to make decisions are incomplete and 
insufficient. Better use of high-quality data will improve student 
success, especially for underserved students who too often 
are left out of data metrics and systems. 


In the current system, institutions report data—sometimes very 
similar data—to a variety of entities, including states, accred- 
iting agencies, voluntary data initiatives, and the federal govern- 
ment. Within the federal government, the Integrated 
Postsecondary Education Data System (IPEDS) includes insti- 
tution-level, aggregate data on student access, progression, 
price, and completion at each participating institution. The 
National Student Loan Data System (NSLDS) consists of
student- and loan-level data on federal student aid recipients, 
and the Department of Veterans Affairs (VA) and Department of 
Defense (DoD) hold administrative data on students receiving 
veteran education benefits. For the most part, these data 
systems do not communicate, creating a confusing and 
disjointed data infrastructure that produces incomplete metrics. 


Because data are collected at different levels (e.g., student- 
level, loan-level, institution-level) and there have been few efforts 
to match data across systems, our current framework cannot 
answer critical questions about college enrollment, completion,
costs, and outcomes. Without clear answers about outcomes
for today’s students, it is difficult for students to make informed
choices, for policymakers to promote equitable access to and
success in higher education, and for institutions to implement
data-driven reforms. Equity and success for all students will
remain elusive until policymakers, students, and institutions
have the proper information to make decisions.


Over the past ten years, the higher education field has estab-
lished key metrics to advance student success by measuring
postsecondary performance, efficiency, and equity. Using
examples from voluntary data collections, IHEP constructed a
postsecondary metrics framework that reflects a decade of
progress in defining metrics and using data. Figure 1, below,
shows the recommended metrics necessary to meet the
needs of all relevant stakeholders.8 This limited, core set of
consistent metrics serves specific policy, consumer informa-
tion, and institutional improvement purposes, while focusing
on equity in student access, progression, completion, cost,
and post-college outcomes.9 These metrics would be reported
and used at the program- and institution-level, aggregated
from the student-level data matched by the system.


Figure 1: A Field-Driven Metrics Framework

Source: Toward Convergence: A Technical Guide for the Postsecondary Metrics Framework: http://www.ihep.org/sites/default/files/uploads/postsecdata/docs/resources/ihep_toward_convergence_low_2b.pdf


Policymakers recognize that improving our national postsec- 
ondary data infrastructure is essential. Accordingly, they have 
shown ongoing interest in improving the quality of information 
relied upon by both consumers and the policymakers charged 
with stewarding the federal government’s substantial invest-
ment in higher education. Legislation like the Student Right to
Know Before You Go Acts of 2012, 2013, and 2015, as well as
the College Transparency Act of 2017 (CTA), identifies a secure,
privacy-protected federal student-level data network (SLDN) as
the most effective way to correct the duplicative, disconnected,
and incomplete nature of the current postsecondary data
systems.10


This brief describes operations and capacity, data gover- 
nance, and privacy and security recommendations for 
designing and implementing an SLDN. It also answers basic 
questions about what a federal SLDN would look like: where it 
would operate, who would submit data, who would have 
access to the data, and who would govern the overall system. 
The data governance team would create rules, processes, and
procedures in collaboration with the Commissioner of the
National Center for Education Statistics (NCES). Only those
with clearance and training would have access to the more 
granular data—most data users would receive only aggregate 
institution- or program-level data. 
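
The tiered access described above can be illustrated with a minimal Python sketch. It is not drawn from any actual NCES system: the role names, the `StudentRecord` fields, and the `query` function are all hypothetical, and a real system would layer authentication, auditing, and disclosure controls on top of this idea. The sketch only shows the core rule that cleared users may see student-level rows while everyone else receives institution- or program-level aggregates.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    PUBLIC = "public"            # hypothetical role: most data users
    CLEARED_ANALYST = "cleared"  # hypothetical role: trained, cleared staff


@dataclass(frozen=True)
class StudentRecord:
    institution: str
    program: str
    completed: bool


def query(records, role):
    """Return student-level rows only to cleared roles; aggregates otherwise."""
    if role is Role.CLEARED_ANALYST:
        return list(records)
    # Aggregate to (institution, program) so no individual row is exposed.
    totals = {}
    for r in records:
        key = (r.institution, r.program)
        students, completers = totals.get(key, (0, 0))
        totals[key] = (students + 1, completers + int(r.completed))
    return {
        key: {"students": n, "completion_rate": done / n}
        for key, (n, done) in totals.items()
    }
```

Keeping the aggregation inside the access function, rather than trusting callers to aggregate, is one way to make the less privileged path incapable of returning granular data.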


A federal SLDN would streamline the way institutions report 
data to the federal government, while increasing the quality 
and usability of the resulting information. Currently, every Title 
IV-participating institution uses student-level data to calculate 
aggregate metrics, like graduation rates, for IPEDS. The 
number of metrics that institutions submit can vary, based on 
institution type and level. Many institutions must calculate 
upward of 500 metrics, including, but not limited to, enroll- 
ment, completion, and pricing metrics, disaggregated by a 
variety of subsets of students. By contrast, under a federal 
SLDN, institutions would instead securely report the student- 
level data they already hold to the Department of Education 
(ED). ED would then use the data to calculate and report the 
aggregate metrics, allowing the SLDN to replace portions of 
IPEDS. This system would reduce institutional reporting 
burdens, while allowing ED to calculate even more compre- 
hensive and useful metrics. 


For example, under a federal SLDN, student-level data on 
race/ethnicity would be reported through the system to allow 
ED to calculate the IPEDS enrollment and graduation rates by 
race/ethnicity. Without any additional reporting, but informed
by field input, ED could disaggregate other measures, such
as the IPEDS Outcome Measures, by race/ethnicity. The 
student-level data would never be shared publicly, but it would 
provide a more robust underpinning to the public, institution- 
level metrics. This more complete and efficient system would 
allow for more comprehensive, equity-focused analyses of 
aggregate data. 
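
As a rough illustration of why student-level reporting makes new disaggregations possible without new reporting, the Python sketch below computes a graduation rate for any grouping field from the same set of rows. The field names (`race_ethnicity`, `intensity`, `graduated`), the placeholder group labels, and the `rates_by_group` helper are assumptions made for this example; they are not IPEDS definitions, and real metrics would follow cohort rules set by the governance team.

```python
from collections import defaultdict


def rates_by_group(students, group_field):
    """Aggregate student-level rows into a graduation rate per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [cohort size, graduates]
    for s in students:
        tally = counts[s[group_field]]
        tally[0] += 1
        tally[1] += int(s["graduated"])
    return {group: grads / size for group, (size, grads) in counts.items()}


# Hypothetical student-level rows; field names and values are illustrative only.
students = [
    {"race_ethnicity": "Group A", "intensity": "full-time", "graduated": True},
    {"race_ethnicity": "Group A", "intensity": "part-time", "graduated": False},
    {"race_ethnicity": "Group B", "intensity": "part-time", "graduated": True},
    {"race_ethnicity": "Group B", "intensity": "full-time", "graduated": True},
]

# The same rows support more than one disaggregation with no extra reporting.
by_race = rates_by_group(students, "race_ethnicity")
by_intensity = rates_by_group(students, "intensity")
```

Only the resulting aggregates, not the rows themselves, would be candidates for public release.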


To supplement data submitted by institutions, the SLDN would 
alleviate duplicative reporting by matching student-level data 
from institutions to data already collected by the federal 
government. Agencies like the Department of the Treasury, Social
Security Administration, Office of Federal Student Aid (FSA),
DoD, and VA would enter data sharing agreements with NCES to
produce aggregate reports on post-college workforce outcomes,
the cost of a college degree, and student veteran and
servicemember access and success. NCES is well-equipped to
house this type of system because of its status as
an independent, statistical agency and long history managing 
sensitive data. State and federal policymakers, institutions, 
and students could then leverage these data when making 
decisions about investment and improvements in higher 
education. 


More specifically, data feedback loops to institutions and 
states would provide valuable information on students’ path- 
ways to, and through, other institutions, as well as students’ 
post-college outcomes. Colleges and universities want to 
know how their students fare after leaving their institution— 
either to pursue subsequent education or to enter the work- 
force. Right now, however, schools have limited access to 
information that would answer questions about how prepared 
their students are to succeed after transfer or graduation. Simi- 
larly, states want to understand the development, retention, 
and flow of human capital to help strengthen their economies 
and tailor policies to student and workforce needs. Aggregate 
data or custom queries from an SLDN could answer these 
types of questions for both states and institutions in more 
complete and comprehensive ways than existing data. 


This paper describes the steps toward an infrastructure design 
where data are secure, student privacy is protected, and policy- 
makers, institutions, and students all have the aggregate institu- 
tion- or program-level information they need to make informed 
decisions. Improvements to the infrastructure would provide an 
opportunity to count all postsecondary students, all outcomes, 
and all institutions, while advancing equity through the use of 
disaggregated metrics. This paper provides an overview of the
technical, governance, and capacity requirements for a federal 
SLDN, exploring design considerations and policy recommen- 
dations in three key categories: Operations & Capacity, Data 
Governance, and Privacy & Security. 


Recommendations for a Federal Student-Level Data Network

While members of the postsecondary policy and advocacy 
community11 have underscored the policy imperative and
recommended strategies to create a more cohesive data
ecosystem, we have not yet fully explored the operational and
technical requirements of an improved system.12 Figure 2,
below, describes these considerations. 


Operations and Capacity 


For a data system to function properly, ED needs to assess its 
internal capacity and staffing to provide the daily resources 
necessary for operation. Policymakers will need to address 
funding and internal restructuring in order to implement a 
federal student-level data network. Recommendations for 
doing so are laid out in detail below. 


1. Authorize the creation of a federal SLDN. The creation of
a federal SLDN is currently prohibited by federal statute, so
Congress needs to act to allow the system to be built. The
recently introduced College Transparency Act would overturn
the federal prohibition and create a federal SLDN. The Act has
the support of over 90 organizations that recognize the need
for improved data systems to promote student success.13
Several institutional associations are included among these
supporting organizations, signaling the support of many
colleges and universities.


2. Leverage existing federal and institutional data to count 
all students and all outcomes. A primary goal of a federal 
SLDN is to count all students and all outcomes to accurately 
represent the state of today’s postsecondary system. Histori- 
cally, federal collections have only focused on traditional 
student cohorts (e.g., first-time, full-time students who enroll in
the fall semester, or students who receive federal financial
aid).14 Higher education metrics should also include part-time,
transfer, and non-aided students, and measure all pathways
and outcomes, such as transfer, completion after transfer, and
workforce outcomes.15


Figure 2: Key Policy Recommendations for a Federal Student-Level Data Network (SLDN)


Operations and Capacity

> Authorize creation of a federal SLDN.
> Leverage existing federal and institutional data to count all students and all outcomes.
> Replace components of existing data collections with a federal SLDN.
> Shift staffing and data system resources from the Integrated Postsecondary Education Data System (IPEDS) to the SLDN.


Data Governance

> Include key stakeholders on the data governance team to inform data integrity, management, and privacy.
> Coordinate on the development and use of unique identifiers to align data across collections.
> Adapt best practices from existing data sharing efforts.


Privacy and Security

> Follow data minimization principles to limit data included in the system to only the necessary elements, retain data only as long as needed, and restrict its use to educational purposes.
> Host the system in a statistical agency and require adherence to strict privacy and security laws and standards, including conducting routine audits, using encryption technology, and following relevant standards and practices from the National Institute of Standards and Technology, the Fair Information Practice Principles, and other leading protocols.
> Implement clear role-based access protocols.


To calculate these metrics, ED should leverage existing federal
and institutional data. If the federal government already holds
data, such as for workforce outcomes, federal financial aid
receipt, or veteran education benefits, then institutions should
not be required to report that information. Instead, institutions
should report information not already held by the federal govern-
ment, such as enrollments, pricing, and completions. This type
of information is needed to calculate IPEDS metrics, and will
help fill reporting gaps. Institutionally reported data would be
periodically and securely matched with federally held data to
produce useful, aggregate information. For instance, institu-
tions would report student-level enrollment and completion
data to NCES, which would combine the data into cohorts.
NCES would securely transmit individual-level data, grouped
into the pre-determined cohorts, to the Department of the Trea-
sury, which would match the data with earnings information.
Treasury would send aggregate workforce information back to
NCES on the cohorts, but would not transfer earnings data
about individual students. After sending the aggregate data to
NCES, Treasury should destroy the data post-matching by
following NIST guidelines for media sanitization.16
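
The NCES-to-Treasury exchange described in this recommendation can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of how either agency operates: the function names, the `student_id` matching key, the cohort definition (institution, program, completion year), and the use of median earnings are all hypothetical. Clearing the in-memory cohort dictionary is only a stand-in for data destruction; actual media sanitization under NIST guidance involves far more than this.

```python
from statistics import median


def nces_build_cohorts(student_rows):
    """NCES side: group student identifiers into pre-determined cohorts."""
    cohorts = {}
    for row in student_rows:
        key = (row["institution"], row["program"], row["completion_year"])
        cohorts.setdefault(key, []).append(row["student_id"])
    return cohorts


def treasury_aggregate(cohorts, earnings_by_id):
    """Treasury side: match on identifier and return cohort-level aggregates only."""
    results = {}
    for key, ids in cohorts.items():
        matched = [earnings_by_id[i] for i in ids if i in earnings_by_id]
        results[key] = {
            "matched": len(matched),
            "median_earnings": median(matched) if matched else None,
        }
    # Stand-in for destroying the transmitted data once the match is complete.
    cohorts.clear()
    return results
```

The return value carries no identifiers and no individual earnings, which mirrors the requirement that Treasury send back aggregate workforce information on the cohorts rather than data about individual students.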


3. Replace components of existing data collections with a 
federal SLDN. Large portions of IPEDS, such as the Fall Enroll- 
ment, 12-Month Enrollment, Graduation Rates, 200% Gradua- 
tion Rates, Outcome Measures, Completions, Student Financial 
Aid, and portions of the Admissions surveys, could be replaced 
by a federal SLDN. The governance team, data experts, and 
federal agencies should examine existing data collections and 
replace components that would be duplicated by a student- 
level system. 


4. Shift staffing and data system resources from IPEDS to 
the SLDN. The SLDN will require staff, software, hardware, 
training resources, and mechanisms for engaging with the 
governance team. To ensure comprehensive data governance 
and compliance with privacy and security standards, funding 
and resources should shift from the IPEDS collection to the 
SLDN, as the system will fulfill the student components of 
IPEDS reporting. ED will need to make internal capacity adjust- 
ments to phase out old collection mechanisms, phase in new 
systems, conduct trainings, respond to inquiries from reporting 
institutions and states during the transition, and convene a 
data governance team to assure the success of the system. 


Data Governance


Data governance is a systemic process that formalizes how 
people interact (or do not interact) with data systems.17 This
process requires technical input and expertise, and should be 
an integral facet of the data infrastructure strategy. ED should 
form a data governance team early in the planning phase to 
manage the integration of complex education and workforce 
data systems for the federal SLDN. Statute should require that 
this team involve representation from each of the agencies 
involved in the network, including leaders like chief informa- 
tion officers, privacy officers, and database administrators; 
experts in privacy and security, postsecondary data quality, 
and consumer protection; governmental and non-govern- 
mental researchers; institution, state, and agency representa- 
tives; and data users, like students and policymakers. With 
diverse representation of subject matter and policy experts, 
the governance team will be able to develop policies and 
processes that create a sustainable and flexible data network. 


Once assembled, the data governance team would be respon- 
sible for the research and execution of memorandums of 
understanding (MOUs) and data sharing agreements between 
multiple stakeholders. The team would also be responsible for 
establishing a comprehensive governance program that 
ensures security, privacy, confidentiality, integrity, and appro-
priate accessibility of the data, including the development of a
data dictionary. To accomplish this non-exhaustive list of tasks 
for the governance team, policymakers should consider the 
following recommendations. 


1. Include key stakeholders on the data governance team 
to inform data integrity, management, and privacy. An 
SLDN’s governance team should require engagement with 
the key stakeholders and prioritize, first and foremost, the 
needs of students. The governance team should include 
representatives of students and families, consumer protec- 
tion advocates, institutions, states, privacy and security 
experts, postsecondary data experts, and data users, such 
as policy analysts and researchers. This team should guide 
system design and implementation by defining all aspects of 
data management and use. By leveraging the expertise and 
experience of these groups, the data governance team 
should work with developers to create a system that meets 
statutory requirements, agency parameters, privacy and 
security protocols, and consumer information needs. The 
governance team should also convene privacy and security 
experts to regularly review and audit data and technology 
standards. The team should also create data access policies, 
including guidelines for developing a public tool (similar to 
NCES PowerStats) and restricted use licenses for researchers, 
as well as data minimization, retention, destruction, and 
breach protocols. 


2. Coordinate on the development and use of unique iden- 
tifiers to align data across collections. A unique identifier for 
student-level data plays the pivotal role of allowing these data 
to be matched across collections. While the data matching 
process comes with the aforementioned privacy and security 
considerations, accurate matches require an identifier or a 
series of identifiers. A federal data network can learn from 
states and other data initiatives that have securely matched 
data across sources. For instance, the Western Interstate 
Commission for Higher Education’s (WICHE) Multistate Longi- 
tudinal Data Exchange (MLDE) leverages the National Student 
Clearinghouse’s (NSC) proprietary identity resolution process 
to match persons between education and workforce data 
sources, like state departments of education, labor market 
information offices, and NSC.18 Similarly, the Arkansas
Research Center utilizes a dual-database architecture to
match records while protecting students’ identities. Under this
approach, personally identifiable information (PII) is secured
in one database and matched with a temporary unique,
random identifier to align with other databases.19 Federal
agencies can use these and other models to create data
sharing agreements and processes that protect student
privacy and secure PII as carefully as possible.
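
The general idea behind a dual-database design can be shown with a short Python sketch. It is an illustration of the concept only and should not be read as the Arkansas Research Center's actual implementation: the `IdentityStore` class, the choice of `ssn` and `birth_date` as the PII fields, and the `deidentify` and `link` helpers are assumptions made for this example. PII lives only in the identity store, which hands out random identifiers; the analytic records carry those identifiers and nothing that names a person.

```python
import secrets


class IdentityStore:
    """Database 1: holds PII and its mapping to temporary random identifiers."""

    def __init__(self):
        self._temp_id_by_pii = {}

    def temp_id(self, ssn, birth_date):
        key = (ssn, birth_date)
        if key not in self._temp_id_by_pii:
            # Random, so the identifier reveals nothing about the person.
            self._temp_id_by_pii[key] = secrets.token_hex(16)
        return self._temp_id_by_pii[key]


def deidentify(rows, store):
    """Swap PII fields for the temporary identifier before loading database 2."""
    cleaned = []
    for row in rows:
        tid = store.temp_id(row["ssn"], row["birth_date"])
        rest = {k: v for k, v in row.items() if k not in ("ssn", "birth_date")}
        cleaned.append({"temp_id": tid, **rest})
    return cleaned


def link(education_rows, workforce_rows):
    """Join de-identified education and workforce rows on the temporary identifier."""
    wages = {r["temp_id"]: r["wages"] for r in workforce_rows}
    return [
        {**e, "wages": wages[e["temp_id"]]}
        for e in education_rows
        if e["temp_id"] in wages
    ]
```

Because the identifiers are temporary, discarding the identity store's mapping after a match would leave the linked records with no path back to the underlying PII.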


3. Adapt best practices from existing data sharing efforts. 
Despite the lack of a comprehensive, national data network, 
some states, institutions, and federal agencies have under-
taken efforts to match data as part of their decision-making
processes. Lessons learned from these initiatives in metric 
alignment, capacity requirements, and MOUs can inform the 
changes needed to implement a federal SLDN. For example: 


* The College Scorecard required the coordination of large 
administrative datasets from within ED and the U.S. Depart- 
ment of the Treasury. Treasury matched student-level data 
from FSA with earnings data from administrative tax records 
for federally-aided students and shared aggregate, institu- 
tion-level results with ED to match institution-level IPEDS 
and FSA data.20 A national SLDN would build similar
matches to allow workforce outcome metrics to be calcu- 
lated for aided and non-aided students. ED and Treasury 
can leverage lessons learned and MOUs from this initial 
data sharing to implement the improved data network. 


* Some states are participating in data sharing initiatives to 
understand student outcomes while accounting for 
student mobility. WICHE’s MLDE connects four state data 
systems to measure wage and employment outcomes 
across states. This initiative provides valuable insight into 
challenges for data sharing that arise when different agen-
cies work together to fill gaps in knowledge.21 The data
governance team should use the lessons learned in 
establishing complex MOUs and navigating state legisla- 
tion on student data privacy and security to inform its 
process.22


* The University of Texas (UT) System partnered in fall 2016
with the U.S. Census Bureau to connect education data with
salary and jobs data from the Longitudinal Employer-Household
Dynamics (LEHD) program.23 This partnership includes a
10-year agreement and provides both Census and UT with data
that connect degree attainment with labor market outcomes.
These data augment UT’s ability to understand the impact of
higher education on its students who stay in-state, as well as
those who move out of Texas, and are intended to inform policy
decisions on student debt, program assessment, and student
advising initiatives.


* To provide veterans with better information about post-
secondary opportunities, the Department of Labor, DoD,
ED, and VA have experimented with collaboration and
data sharing.24 These are some of the primary agencies
that would need to coordinate to implement a compre- 
hensive system that includes all student- and policy-rele- 
vant metrics, especially to meet the needs of veterans and 
simplify the reporting on data about service members’ 
and veterans’ use of financial aid. 


Privacy and Security 


The value proposition for a federal SLDN is clear: it will close 
gaps in current data systems and, in turn, will answer stake- 
holder questions about access, success, and outcomes for all 
students while providing the information necessary to protect 
taxpayer investment in higher education. With answers to 
these questions, students will be able to make more informed 
decisions, policymakers will be able to drive changes in policy 
and practice to protect both students and taxpayers, and insti- 
tutions will be able to tackle continuous improvement efforts. 
All of this will result in more equitable and improved student 
outcomes. New data matches will require thoughtful and 
ongoing attention to privacy and security, but federal agencies 
will be able to mitigate risk by leveraging industry best prac- 
tices and complying with relevant federal information tech- 
nology (IT) system requirements, including the Federal Trade 
Commission’s Fair Information Practice Principles (FIPPs) and 
the standards set by the National Institute of Standards and
Technology (NIST).25 Protecting the privacy and security of
student data must remain the top priority for those who report, 
collect, or aggregate the data. They must ensure that the 
SLDN adheres to all laws governing data collection and use, 
including those listed in Figure 3, on the next page. 


Continuously updating and properly executing privacy and 
security protocols will protect student data from misuse and 
unauthorized access. The data governance team should 
routinely revisit the system’s privacy and security protocols, 
and consult with experts to make improve- 
ments. In addition, the data network itself 
should be audited on a regular basis, and 
encryption technology should be used to 
protect and secure the data. As technology 
to protect and secure data improves, so 
should the SLDN. Furthermore, data in an 
SLDN should not be used to punish or take 
corrective action against students or for corporate gain. The 
following recommendations detail ways to protect student 
data while implementing a more robust data network. 


1. Follow data minimization principles to limit data included 
in the system to only the necessary elements, retain data 
only as long as needed, and restrict its use to educational 
purposes. Data minimization is a key privacy principle. A 
federal SLDN should contain only the data elements that are 
needed to answer questions of national importance about 
college access, success, cost, and outcomes. The data should 
be restricted to educational purposes and retained only as 
long as necessary. The data governance team should deter- 
mine which elements to include in the network to serve the 
purposes outlined in statute like consumer information, policymaking,
and institutional improvement. A proposed limited list
of data elements is included in the appendix to this paper. A
federal data system should not include sensitive data elements
on students or their families, including, but not limited to, those
related to health, social-emotional, immigration, or disciplinary
records. Sensitive elements such as these should be statutorily
prohibited from being collected.


Figure 3: Laws and Agencies Related to Student Data Privacy and Security


The Family Educational Rights and Privacy Act of 1974 (FERPA)
FERPA protects the access and use of student data by all educational agencies and institutions that receive federal funding. Once a student
attends a postsecondary institution, the rights formerly provided to the parent are transferred to the student. FERPA also supports the
protection of personally identifiable information (PII).

The Privacy Act of 1974
The Privacy Act of 1974 protects the privacy of records created and used by the federal government. Accordingly, it would apply to all data
stored within a federal SLDN. In addition to stating the rules that the government must follow when collecting data about a person, the law
ensures that the government cannot disclose data about a person without that person’s permission unless the disclosure meets one of the
12 broad statutory exemptions outlined in the statute.

Confidential Information Protection and Statistical Efficiency Act (CIPSEA)
CIPSEA provides strong confidentiality protection to data used for statistical purposes. It protects the data from law enforcement, taxation,
and regulatory use.

Higher Education Act (HEA)
HEA authorizes numerous federal aid programs that provide support to individuals and institutions. HEA provisions govern the permissible
use of data collected through the Free Application for Federal Student Aid (FAFSA) and in the National Student Loan Data System (NSLDS).

The Federal Information Security Management Act of 2002 (FISMA)
FISMA applies to federal IT systems and other IT systems that hold federal data. It requires those systems to adhere to common national
standards regarding information security protection and to utilize a risk-based approach when securing data. Annual reviews through the
Office of Management and Budget are also required to ensure the security program in place is adequate.

E-Government Act of 2002
The E-Government Act of 2002 protects how data are collected, stored, and used in a federally-held IT system. The law requires a privacy
impact assessment to be completed, as well as the posting of privacy notices regarding a system’s data collection practices.

The Gramm-Leach-Bliley Act (GLBA)
The GLBA impacts how institutions collect, store, and use student financial records containing PII.

The Fair and Accurate Credit Transaction Act of 2003 (FACTA)
FACTA also impacts how institutions collect, store, and use student financial records containing PII.

Fair Information Practice Principles (FIPPs)
The Federal Trade Commission’s FIPPs are guidelines for how entities collect and use personal information, and the safeguards they use to
assure adequate privacy protection. These principles are part of the Privacy Act of 1974 (see above).

National Institute of Standards and Technology (NIST)
This agency resides within the U.S. Department of Commerce and develops publications and standards for privacy and security as part of
its statutory responsibility under FISMA.

Office of Inspector General (OIG) at ED
ED’s Office of Inspector General promotes efficiency, effectiveness, and integrity of ED operations and programs. OIG conducts the annual
FISMA compliance audit for ED.

Privacy Technical Assistance Center (PTAC)
This center resides within ED and serves as a “one-stop” resource for education stakeholders to learn about data privacy, confidentiality,
and security practices related to student-level longitudinal data systems and other uses of student data. PTAC also provides training
materials and opportunities to receive direct assistance in the aforementioned topics.


Source: Selected sections of table are excerpted from Understanding Information Security and Privacy in Postsecondary Education Data Systems (Grama, 2016). http://www.ihep.org/sites/default/files/uploads/postsecdata/docs/resources/information_security_and_privacy.pdf


2. Host the system in a statistical agency, and require 
adherence to strict, industry-leading privacy and security 
standards, including conducting routine audits, using 
encryption technology, and following relevant standards 
and practices from NIST, FIPPs, and other leading proto- 
cols. ED includes several offices that collect, process, and 
analyze data at different levels. The NCES, the proposed 
agency to house the federal SLDN per the College Transpar- 
ency Act, adheres to statistical agency standards and laws 
and has maintained secure databases without incident for 
decades. As part of ongoing system audits and reevaluations,
ED leadership should review and enhance privacy and security
protocols in accordance with applicable legislation, like
FISMA and CIPSEA, and emerging best practices in the field, 
including encryption technologies. Additionally, agencies 
involved in data sharing should leverage information security 
publications from NIST to ensure that they maintain the privacy 
of all students while keeping their data secure and protected 
from internal and external breaches. 


To ensure that ED and other participating federal agencies 
safeguard student data, policymakers should require agen- 
cies to adhere to strict privacy and security standards for data 
matching, use, and storage, including applying penalties for 
willful violations. Such a requirement would signal the serious- 
ness of privacy and security standards at the federal level. Any 
applicable statute, however, must permit standards to evolve 
with developing technologies and emerging threats. Statutory 
language, therefore, should not be prohibitively explicit. For 
example, policy should require the data network to comply 
with the relevant portions of federal and industry-leading stan- 
dards, such as those delineated by NIST, FISMA, and FIPPs. 

3. Implement clear role-based access protocols. Role-based
access to data ensures that the right people will have
access to the right data. The data governance team is responsible
for creating access guidelines for the system. The team
could suggest using technology like multifactor authentication
to safeguard data from improper access and use.
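

As an illustration only, recommendations 1 and 3 above can be sketched in a few lines of Python. The sketch assumes a hypothetical allowlist of data elements, a hypothetical set of prohibited sensitive categories, and three hypothetical user roles; none of these names, roles, or resource labels come from this report or from any existing federal system, and the data governance team would define the real ones.

```python
from enum import Enum

# Hypothetical allowlist, loosely modeled on a few appendix elements.
# This is NOT an official schema.
ALLOWED_ELEMENTS = frozenset({
    "unique_student_id",
    "institution_id",
    "date_first_enrolled",
    "credits_attempted",
    "credits_completed",
})

# Hypothetical labels for the sensitive categories that recommendation 1
# says should never be collected.
PROHIBITED_ELEMENTS = frozenset({
    "health", "social_emotional", "immigration", "disciplinary",
})


def minimize_record(record: dict) -> dict:
    """Keep only allowlisted elements (data minimization).

    Raises ValueError when a prohibited element is submitted, so the
    reporting error is surfaced instead of being silently stored.
    """
    prohibited = PROHIBITED_ELEMENTS.intersection(record)
    if prohibited:
        raise ValueError(f"prohibited elements submitted: {sorted(prohibited)}")
    return {k: v for k, v in record.items() if k in ALLOWED_ELEMENTS}


class Role(Enum):
    """Hypothetical user roles; real roles would be set by governance."""
    PUBLIC = "public"
    RESEARCHER = "researcher"
    SYSTEM_ADMIN = "system_admin"


# Hypothetical mapping of each role to the resources it may read.
ROLE_PERMISSIONS = {
    Role.PUBLIC: frozenset({"aggregate_metrics"}),
    Role.RESEARCHER: frozenset({"aggregate_metrics", "deidentified_records"}),
    Role.SYSTEM_ADMIN: frozenset({
        "aggregate_metrics", "deidentified_records", "identified_records",
    }),
}


def can_access(role: Role, resource: str, mfa_verified: bool = False) -> bool:
    """Role-based access check that denies by default.

    Record-level resources additionally require a completed multifactor
    authentication step (an assumption made for this sketch).
    """
    if resource not in ROLE_PERMISSIONS.get(role, frozenset()):
        return False
    if resource != "aggregate_metrics" and not mfa_verified:
        return False
    return True
```

In this sketch, a submitted record is first passed through `minimize_record`, which drops anything outside the allowlist, and every read request is then checked with `can_access`. Denying by default means that a role or resource missing from the mapping results in no access rather than accidental access.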


With a framework for which metrics to collect (see Figure 1), 
the federal government needs to reconstruct the national data 
infrastructure to facilitate the collection and use of these data. 
Policymakers and federal agencies should collaborate to 
improve data matching and utilize the data they already have 
for decision-making. These matches, however, will take time to 
implement. Creating and maintaining a federal SLDN will 
involve five phases (see Figure 4, below): 


1. In the planning phase, the governance team will create the 
procedural foundation for the system. They will outline the 
processes needed to implement the system, including the 
data sharing agreements required to collect and use data. The 
governance team will use the planning phase to ensure that 
the system meets the statutory requirements in the Higher 
Education Act for federal data collection. The team will also
confirm data sharing agreements across participating federal
agencies. As a result, this could be the most time-consuming 
and most politically complicated of the five project phases. 


2. The requirements gathering phase will lay the technical 
groundwork to build a secure system by defining system roles 
and privileges for users, as well as creating guides for users at 
all access levels. During this time the data governance team 
will actively gather the input of key stakeholders on the design 
and uses of the system by convening interagency workgroups 
and review panels. 


3. In the development phase, system architects will build the 
system and test it with end users and data providers, such as 
institutions and federal agencies. Architects will construct how 
the data systems across agencies will match and integrate 
data. They will also determine which mechanisms are best for 
data collection and reporting, and then test these assumptions 
with the data governance team and workgroups. 


4. The implementation phase will include one year of troubleshooting
with end users and data providers to solve technical
issues around data input, reporting, and management. ED will
also spend this year building staff capacity and training them on
the intricacies of the system. This phase readies the system for
use. At the conclusion of the phase, ED will deploy the system.


5. The maintenance phase will continue indefinitely. It initiates
data collection and includes processes for support and system
improvements, including the refinement of data elements in
the system.


Figure 4: Phases for Creating and Implementing a Student-Level Data Network

Planning Phase, Requirements Gathering Phase, Development Phase, Implementation Phase, Maintenance Phase


Conclusion 

Better data are a necessary tool for helping more students, 
especially low-income students and students of color, succeed 
in higher education. Upgrades to the national postsecondary 
data infrastructure are needed to answer questions that will 
allow for improved student outcomes through policymaker, 
practitioner, and student action. A student-level data network 
will bring the postsecondary data infrastructure closer to the 
ideal, where state and federal policymakers, students and 
families, and institutions all have the information they need to 
make important decisions and close equity gaps. The creation 
of an SLDN will require significant federal action and interagency
collaboration. Nonetheless, a failure to act will
hamper broad, bipartisan efforts to enhance student choice,
transparency, and student outcomes. Students can
wait no longer for the information and transparency that a 
more efficient and effective data system can provide. 

Appendix: Elements to Include in a Federal Student-Level Data Network


DATA ELEMENT | METRIC SUPPORTED | WHO REPORTS?/DATA SOURCE

Social Security Number | Record matching | All linked sources
Unique student ID | Record matching | Created for system operation
Name | Record matching | Institution
Date of birth | Age at time of entry; Record matching | Institution
State of residency | IPEDS enrollment; Cost of Attendance | Institution
Race/ethnicity | Race/ethnicity disaggregate | Institution
Gender | Gender disaggregate | Institution
Family Income | Economic status disaggregate | FSA (for aided); Institution (for non-aided)
Military/Veteran Status | Military status disaggregate | Departments of Defense and Veterans Affairs
First generation college student status | First generation disaggregate | FSA and Institution
Institution ID | Record matching; Calculation of institution-level metrics | IPEDS/Office of Postsecondary Education at ED
Date first enrolled | IPEDS yield rate; Cohort determination; Time to Credential; Enrollment | Institution
Enrollment by term | Retention; Persistence | Institution
Credential-seeking status | Cohort determination; Disaggregate; Graduate Education Rate | Institution
Transfer Status | Enrollment status | Institution
Program of Study (CIP Code) | Program of study selection; Disaggregates | Institution
Enrollment mode | Modality | Institution
College-ready status | Academic preparation disaggregate | Institution
Number of credits attempted | Credit accumulation; Credit completion ratio; Credits to degree; Attendance intensity (Full-time, part-time) | Institution
Number of credits completed | Credit accumulation; Credit completion ratio; Credits to degree; Attendance intensity (Full-time, part-time) | Institution
Completion of math gateway course in first year | Gateway Course Completion | Institution
Completion of English gateway course in first year | Gateway Course Completion | Institution
Date of credential award | Graduation rate; Completers; Time to credential; IPEDS Degrees conferred | Institution
Credential award level | IPEDS degrees conferred; Disaggregate | Institution
CIP Code of credential award | IPEDS degrees conferred; Disaggregate | Institution
FAFSA submission flag | Financial aid application | FSA
EFC | Net Price; Unmet Need | FSA
Tuition and fees | Cost of attendance; Net Price; Unmet Need | Institution
In-state (or in-district) tuition eligibility | Cost of attendance; Net Price; Unmet Need | Institution
Room and board | Cost of attendance; Net Price; Unmet Need | Institution
Living arrangement | Cost of attendance; Net Price; Unmet Need | Institution
Books and supplies | Cost of attendance; Net Price; Unmet Need | Institution
Other expenses | Cost of attendance; Net Price; Unmet Need | Institution
Grant amounts, by source (federal, state, institutional) and type (need/non-need-based) | IPEDS grant awards; Net Price; Pell receipt/Economic Status disaggregate | FSA and Institution
Loan amounts, by source (federal, state, institutional, private), type (Subsidized Stafford, Unsubsidized Stafford, Perkins, PLUS), and interest rate | IPEDS loan amounts | FSA and Institution
Work-study amounts (by source) | Net Price; Unmet Need | FSA and Institution
Military benefits amount (by source and type) | GI Bill awards; Tuition Assistance Program awards | Departments of Defense and Veterans Affairs
Cumulative debt | Cumulative Debt; Repayment rate | FSA
Date entering repayment | Repayment rate | FSA
Loan payment amount | Repayment rate | FSA
Remaining debt | Repayment rate | FSA
Repayment status | Cohort default rate; Repayment rate | FSA
Annual earnings | Employment rate; Earnings; Earnings threshold | Department of Treasury

REQUIRED FOR IPEDS REPORTING
n/a 


n/a 


n/a 

The Institute for Higher Education Policy (IHEP) is a nonpartisan, nonprofit organization committed to promoting access to and success in higher
education for all students. Based in Washington, D.C., IHEP develops innovative policy- and practice-oriented research to guide policymakers and 
education leaders, who develop high-impact policies that will address our nation’s most pressing education challenges. 


